risky state
The Best Time for an Update: Risk-Sensitive Minimization of Age-Based Metrics
de Sombre, Wanja, Ortiz, Andrea, Aurzada, Frank, Klein, Anja
Popular methods to quantify transmitted data quality are the Age of Information (AoI), the Query Age of Information (QAoI), and the Age of Incorrect Information (AoII). We consider these metrics in a point-to-point wireless communication system, where the transmitter monitors a process and sends status updates to a receiver. The challenge is to decide on the best time for an update, balancing the transmission energy and the age-based metric at the receiver. Due to the inherent risk of high age-based metric values causing complications such as unstable system states, we introduce the new concept of risky states to denote states with high age-based metric. We use this new notion of risky states to quantify and minimize this risk of experiencing high age-based metrics by directly deriving the frequency of risky states as a novel risk-metric. Building on this foundation, we introduce two risk-sensitive strategies for AoI, QAoI and AoII. The first strategy uses system knowledge, i.e., channel quality and packet arrival probability, to find an optimal strategy that transmits when the age-based metric exceeds a tunable threshold. A lower threshold leads to higher risk-sensitivity. The second strategy uses an enhanced Q-learning approach and balances the age-based metric, the transmission energy and the frequency of risky states without requiring knowledge about the system. Numerical results affirm our risk-sensitive strategies' high effectiveness.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
Imitate with Caution: Offline and Online Imitation
As the same itself suggests, almost every species including humans learn by imitating and also improvise. Similarly we can make machines mimic us and learn from a human expert. Autonomous driving is a good example: We can make an agent learn from millions of driver demonstrations and mimic an expert driver. This Learning from demonstrations also known as Imitation Learning (IL) is an emerging field in reinforcement learning and AI in general. The application of IL in robotics is ubiquitous, a robot can learn a policy from analysing demonstrations of the policy performed by a human supervisor.
Minimizing the Negative Side Effects of Planning with Reduced Models
Saisubramanian, Sandhya, Zilberstein, Shlomo
Reduced models of large Markov decision processes accelerate planning by considering a subset of outcomes for each state-action pair. This reduction in reachable states leads to replanning when the agent encounters states without a precomputed action during plan execution. However, not all states are suitable for replanning. In the worst case, the agent may not be able to reach the goal from the newly encountered state. Agents should be better prepared to handle such risky situations and avoid replanning in risky states. Hence, we consider replanning in states that are unsafe for deliberation as a negative side effect of planning with reduced models. While the negative side effects can be minimized by always using the full model, this defeats the purpose of using reduced models. The challenge is to plan with reduced models, but somehow account for the possibility of encountering risky situations. An agent should thus only replan in states that the user has approved as safe for replanning. To that end, we propose planning using a portfolio of reduced models, a planning paradigm that minimizes the negative side effects of planning using reduced models by alternating between different outcome selection approaches. We empirically demonstrate the effectiveness of our approach on three domains: an electric vehicle charging domain using real-world data from a university campus and two benchmark planning problems.
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Transportation > Electric Vehicle (0.87)